CRITICAL ANALYSIS OF ELECTRIC VEHICLES IN CANADA

title.PNG

Introduction

What is our dataset about?

We selected five data sets to understand the fuel efficiency of certain vehicle types and how suitable they are in different situations.

What are we trying to prove here?

Are electric vehicles efficient?

Are electric vehicles suitable for everyone?

Where is our dataset coming from?

Vehicle Sales Data:

vs.PNG

ev+phev+fc.PNG

Electric Vehicles Data:

ev.PNG

Plug-in Hybrid Electric Vehicles Data:

phev.PNG

Data Collection and Look Up

In [1]:
import pandas as pd
import pandas_profiling
import numpy as np
import matplotlib.pyplot as plt
%pylab inline
pylab.rcParams['figure.figsize'] = (12.0, 10.0)
import seaborn as sns
%matplotlib inline
from pandas.plotting import scatter_matrix
from scipy import stats, special
from sklearn import model_selection, metrics, linear_model, datasets, feature_selection
from seaborn import pairplot, heatmap
from sklearn.model_selection import train_test_split

# Import all the libraries needed for the analysis.
Populating the interactive namespace from numpy and matplotlib

Reading files into pandas

In [2]:
vs = pd.read_excel('VehiclesSalesData.xls')

# The vehicle sales data is stored as `vs` for short.
In [3]:
ev = pd.read_csv('2012-2020_Battery_Electric_Vehicles.csv')

# The battery electric vehicles data is stored as `ev` for short.
In [4]:
phev = pd.read_csv('2012-2020_Plug-in_Hybrid_Electric_Vehicles.csv')

# The plug-in hybrid electric vehicles data is stored as `phev` for short.

Take a look at the data and the first few rows

In [5]:
print("Vehicles Sales rows and columns:")
print(vs.shape)

print("Battery Electric Vehicles rows and columns:")
print(ev.shape)

print("Plug in Hybrid Electric Vehicles rows and columns:")
print(phev.shape)

# Number of rows and columns.
Vehicles Sales rows and columns:
(3465, 17)
Battery Electric Vehicles rows and columns:
(179, 18)
Plug in Hybrid Electric Vehicles rows and columns:
(137, 20)
In [6]:
print("Vehicles Sales First & Last 5 rows:")
print(vs)

print("Battery Electric Vehicles First & Last 5 rows:")
print(ev)

print("Plug in Hybrid Electric Vehicles First & Last 5 rows:")
print(phev)

# Printing a long DataFrame shows its first and last 5 rows.
Vehicles Sales First & Last 5 rows:
      REF_DATE                                   GEO           DGUID  \
0         2011                                Canada  2016A000011124   
1         2011                                Canada  2016A000011124   
2         2011                                Canada  2016A000011124   
3         2011                                Canada  2016A000011124   
4         2011                                Canada  2016A000011124   
...        ...                                   ...             ...   
3460      2019  British Columbia and the Territories     2016A000259   
3461      2019  British Columbia and the Territories     2016A000259   
3462      2019  British Columbia and the Territories     2016A000259   
3463      2019  British Columbia and the Territories     2016A000259   
3464      2019  British Columbia and the Territories     2016A000259   

             Fuel Type            Vehicle type          Statistics    UOM  \
0       All fuel types     Total, vehicle type  Number of vehicles  Units   
1       All fuel types          Passenger cars  Number of vehicles  Units   
2       All fuel types          Pick up trucks  Number of vehicles  Units   
3       All fuel types  Multi-purpose vehicles  Number of vehicles  Units   
4       All fuel types                    Vans  Number of vehicles  Units   
...                ...                     ...                 ...    ...   
3460  Other fuel types     Total, vehicle type  Number of vehicles  Units   
3461  Other fuel types          Passenger cars  Number of vehicles  Units   
3462  Other fuel types          Pick up trucks  Number of vehicles  Units   
3463  Other fuel types  Multi-purpose vehicles  Number of vehicles  Units   
3464  Other fuel types                    Vans  Number of vehicles  Units   

      UOM_ID SCALAR_FACTOR  SCALAR_ID       VECTOR COORDINATE      VALUE  \
0        300        units           0  v1079014832    1.1.1.1  1623288.0   
1        300        units           0  v1079014839    1.1.2.1   667842.0   
2        300        units           0  v1079014846    1.1.3.1   298728.0   
3        300        units           0  v1079014853    1.1.4.1   530754.0   
4        300        units           0  v1079014860    1.1.5.1   125964.0   
...      ...           ...        ...          ...        ...        ...   
3460     300         units          0  v1079015188   11.7.1.1       32.0   
3461     300         units          0  v1079015195   11.7.2.1       16.0   
3462     300         units          0  v1079015202   11.7.3.1        0.0   
3463     300         units          0  v1079015209   11.7.4.1        7.0   
3464     300         units          0  v1079015216   11.7.5.1        9.0   

     STATUS  SYMBOL  TERMINATED  DECIMALS  
0       NaN     NaN         NaN         0  
1       NaN     NaN         NaN         0  
2       NaN     NaN         NaN         0  
3       NaN     NaN         NaN         0  
4       NaN     NaN         NaN         0  
...     ...     ...         ...       ...  
3460    NaN     NaN         NaN         0  
3461    NaN     NaN         NaN         0  
3462    NaN     NaN         NaN         0  
3463    NaN     NaN         NaN         0  
3464    NaN     NaN         NaN         0  

[3465 rows x 17 columns]
Battery Electric Vehicles First & Last 5 rows:
    Model        Make                               Model.1  Vehicle Class  \
0    Year         NaN                                   NaN            NaN   
1    2012  Mitsubishi                                i-MiEV     Subcompact   
2    2012      Nissan                                  LEAF       Mid-size   
3    2013        Ford                        Focus Electric        Compact   
4    2013  Mitsubishi                                i-MiEV     Subcompact   
..    ...         ...                                   ...            ...   
174  2020       Tesla      Model X Performance (22" Wheels)  SUV: Standard   
175  2020       Tesla                Model Y Long Range AWD     SUV: Small   
176  2020       Tesla               Model Y Performance AWD     SUV: Small   
177  2020       Tesla  Model Y Performance AWD (21" Wheels)     SUV: Small   
178  2020  Volkswagen                                e-Golf        Compact   

    Motor Transmission  Fuel        Consumption        Unnamed: 8  \
0    (kW)          NaN  Type  City (kWh/100 km)  Hwy (kWh/100 km)   
1      49           A1     B               16.9              21.4   
2      80           A1     B               19.3                23   
3     107           A1     B                 19              21.1   
4      49           A1     B               16.9              21.4   
..    ...          ...   ...                ...               ...   
174   580           A1     B               26.2              27.1   
175   358           A1     B               16.5              18.3   
176   358           A1     B               16.3              18.7   
177   358           A1     B               18.1              19.8   
178   100           A1     B               17.4              19.9   

            Unnamed: 9       Unnamed: 10      Unnamed: 11       Unnamed: 12  \
0    Comb (kWh/100 km)  City (Le/100 km)  Hwy (Le/100 km)  Comb (Le/100 km)   
1                 18.7               1.9              2.4               2.1   
2                 21.1               2.2              2.6               2.4   
3                   20               2.1              2.4               2.2   
4                 18.7               1.9              2.4               2.1   
..                 ...               ...              ...               ...   
174               26.6               2.9                3                 3   
175               17.3               1.9              2.1               1.9   
176               17.4               1.8              2.1                 2   
177               18.9                 2              2.2               2.1   
178               18.6               1.9              2.3               2.1   

    Range CO2 Emissions    CO2     Smog  Recharge  
0    (km)        (g/km)  Rating  Rating  Time (h)  
1     100             0     NaN     NaN         7  
2     117             0     NaN     NaN         7  
3     122             0     NaN     NaN         4  
4     100             0     NaN     NaN         7  
..    ...           ...     ...     ...       ...  
174   438             0      10      10        12  
175   509             0      10      10        10  
176   507             0      10      10        10  
177   468             0      10      10        10  
178   198             0      10      10       5.3  

[179 rows x 18 columns]
Plug in Hybrid Electric Vehicles First & Last 5 rows:
    Model       Make        Model.1         Vehicle Class Motor Engine Size  \
0    Year        NaN            NaN                   NaN  (kW)         (L)   
1    2012  Chevrolet           Volt               Compact   111         1.4   
2    2013  Chevrolet           Volt               Compact   111         1.4   
3    2013       Ford   C-MAX Energi              Mid-size    35           2   
4    2013       Ford  Fusion Energi              Mid-size    35           2   
..    ...        ...            ...                   ...   ...         ...   
132  2020      Volvo     S60 T8 AWD               Compact    65           2   
133  2020      Volvo     S90 T8 AWD              Mid-size    65           2   
134  2020      Volvo     V60 T8 AWD  Station wagon: Small    65           2   
135  2020      Volvo    XC60 T8 AWD            SUV: Small    65           2   
136  2020      Volvo    XC90 T8 AWD         SUV: Standard    65           2   

     Cylinders Transmission    Fuel                      Consumption Range 1  \
0          NaN          NaN  Type 1               Combined Le/100 km    (km)   
1          4.0           AV       B            2.5 (22.3 kWh/100 km)      56   
2          4.0           AV       B            2.4 (21.4 kWh/100 km)      61   
3          4.0           AV     B/X  2.7 ([23.2 kWh + 0.1 L]/100 km)      32   
4          4.0           AV     B/X  2.7 ([23.2 kWh + 0.1 L]/100 km)      32   
..         ...          ...     ...                              ...     ...   
132        4.0          AS8    B/Z*  3.2 ([29.0 kWh + 0.0 L]/100 km)      35   
133        4.0          AS8     B/Z  3.7 ([31.4 kWh + 0.2 L]/100 km)      34   
134        4.0          AS8    B/Z*  3.2 ([29.0 kWh + 0.0 L]/100 km)      35   
135        4.0          AS8    B/Z*  4.2 ([38.2 kWh + 0.0 L]/100 km)      27   
136        4.0          AS8    B/Z*  4.0 ([36.1 kWh + 0.0 L]/100 km)      29   

     Recharge  Fuel.1    Consumption.1     Unnamed: 14      Unnamed: 15  \
0    Time (h)  Type 2  City (L/100 km)  Hwy (L/100 km)  Comb (L/100 km)   
1           4       Z              6.7             5.9              6.4   
2           4       Z              6.7             5.9              6.4   
3         2.5       X              5.8             6.5              6.1   
4         2.5       X              5.8             6.5              6.1   
..        ...     ...              ...             ...              ...   
132         3       Z              8.4               7              7.8   
133         3       Z              8.3             7.5              7.9   
134         3       Z              8.4               7              7.8   
135         3       Z              9.5             8.7              9.1   
136         3       Z              9.1             8.4              8.8   

    Range 2 CO2 Emissions     CO2    Smog  
0      (km)        (g/km)  Rating  Rating  
1       550            54     NaN     NaN  
2       550            45     NaN     NaN  
3       856            80     NaN     NaN  
4       856            80     NaN     NaN  
..      ...           ...     ...     ...  
132     781            94      10       7  
133     761            99      10       7  
134     781            94      10       7  
135     781           128       9       7  
136     813           120      10       7  

[137 rows x 20 columns]

Descriptive Statistics

In [7]:
vs.describe(include='all')

# View statistical details.
Out[7]:
REF_DATE GEO DGUID Fuel Type Vehicle type Statistics UOM UOM_ID SCALAR_FACTOR SCALAR_ID VECTOR COORDINATE VALUE STATUS SYMBOL TERMINATED DECIMALS
count 3465.000000 3465 3465 3465 3465 3465 3465 3465.0 3465 3465.0 3465 3465 2.520000e+03 945 0.0 0.0 3465.0
unique NaN 11 11 7 5 1 1 NaN 2 NaN 385 385 NaN 1 NaN NaN NaN
top NaN Quebec 2016A000259 All fuel types Vans Number of vehicles Units NaN units NaN v1079014942 11.1.2.1 NaN .. NaN NaN NaN
freq NaN 315 315 495 693 3465 3465 NaN 3080 NaN 9 9 NaN 945 NaN NaN NaN
mean 2015.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 4.794399e+04 NaN NaN NaN 0.0
std 2.582362 NaN NaN NaN NaN NaN NaN 0.0 NaN 0.0 NaN NaN 1.873545e+05 NaN NaN NaN 0.0
min 2011.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 0.000000e+00 NaN NaN NaN 0.0
25% 2013.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 0.000000e+00 NaN NaN NaN 0.0
50% 2015.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 2.430000e+02 NaN NaN NaN 0.0
75% 2017.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 1.065150e+04 NaN NaN NaN 0.0
max 2019.000000 NaN NaN NaN NaN NaN NaN 300.0 NaN 0.0 NaN NaN 2.033597e+06 NaN NaN NaN 0.0
In [8]:
ev.describe(include='all')

# View statistical details.
Out[8]:
Model Make Model.1 Vehicle Class Motor Transmission Fuel Consumption Unnamed: 8 Unnamed: 9 Unnamed: 10 Unnamed: 11 Unnamed: 12 Range CO2 Emissions CO2 Smog Recharge
count 179 178 178 178 179 178 179 179 179 179 179 179 179 179 179 145 118 179
unique 10 16 80 8 32 2 2 59 54 56 18 14 17 77 2 2 2 19
top 2020 Tesla i-MiEV Full-size 386 A1 B 16.9 22.5 19.6 1.9 2.3 2.2 109 0 10 10 12
freq 38 90 6 52 24 177 178 11 10 10 44 30 29 8 178 144 117 62
In [9]:
phev.describe(include='all')

# View statistical details.
Out[9]:
Model Make Model.1 Vehicle Class Motor Engine Size Cylinders Transmission Fuel Consumption Range 1 Recharge Fuel.1 Consumption.1 Unnamed: 14 Unnamed: 15 Range 2 CO2 Emissions CO2 Smog
count 137 136 136 136 137 137 136.000000 136 137 137 137 137 137 137 137 137 137 137 117 101
unique 10 19 54 9 26 13 NaN 13 6 74 33 20 3 54 45 46 72 63 5 7
top 2020 BMW Fusion Energi Mid-size 65 2 NaN AV B/Z* 2.7 ([23.2 kWh + 0.1 L]/100 km) 23 3 Z 8.4 8 5.6 856 80 10 7
freq 30 33 8 41 19 61 NaN 42 49 8 17 44 87 9 10 11 8 8 80 52
mean NaN NaN NaN NaN NaN NaN 4.205882 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
std NaN NaN NaN NaN NaN NaN 1.142561 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
min NaN NaN NaN NaN NaN NaN 2.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
25% NaN NaN NaN NaN NaN NaN 4.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN 4.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN 4.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
max NaN NaN NaN NaN NaN NaN 8.000000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Data cleanup (handling missing values, duplicates, errors, outliers)

In [10]:
vs.isnull().values.any()  # Any null values?
Out[10]:
True
In [11]:
ev.isnull().values.any()  # Any null values?
Out[11]:
True
In [12]:
phev.isnull().values.any()  # Any null values?
Out[12]:
True
In [13]:
vs.isnull().sum()  # Where are the null values?
Out[13]:
REF_DATE            0
GEO                 0
DGUID               0
Fuel Type           0
Vehicle type        0
Statistics          0
UOM                 0
UOM_ID              0
SCALAR_FACTOR       0
SCALAR_ID           0
VECTOR              0
COORDINATE          0
VALUE             945
STATUS           2520
SYMBOL           3465
TERMINATED       3465
DECIMALS            0
dtype: int64
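The counts above can also be read as shares per column, which makes it easier to separate columns that are entirely empty (`SYMBOL`, `TERMINATED`) from ones that are only partly missing (`VALUE`). The following is a minimal sketch on a small made-up frame; the values are illustrative and not taken from the sales file.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the sales table; values are made up for illustration.
sample = pd.DataFrame({
    "VALUE": [215.0, np.nan, 17.0, np.nan],
    "STATUS": [np.nan, "..", np.nan, ".."],
    "SYMBOL": [np.nan, np.nan, np.nan, np.nan],
})

# Fraction of missing entries per column (0.0 = complete, 1.0 = entirely empty).
missing_share = sample.isnull().mean()

# Columns with no usable data at all are candidates for dropping.
empty_columns = missing_share[missing_share == 1.0].index.tolist()

# Check whether VALUE is missing exactly on the rows where STATUS carries a flag.
value_missing_when_flagged = (
    sample["VALUE"].isnull() == sample["STATUS"].notnull()
).all()

print(missing_share)
print(empty_columns)
print(value_missing_when_flagged)
```

In the real table the 945 missing `VALUE` entries match the 945 non-null `STATUS` entries in count, which suggests, but does not prove, that the status flag marks figures that are unavailable.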
In [14]:
ev.isnull().sum()  # Where are the null values?
Out[14]:
Model             0
Make              1
Model.1           1
Vehicle Class     1
Motor             0
Transmission      1
Fuel              0
Consumption       0
Unnamed: 8        0
Unnamed: 9        0
Unnamed: 10       0
Unnamed: 11       0
Unnamed: 12       0
Range             0
CO2 Emissions     0
CO2              34
Smog             61
Recharge          0
dtype: int64
In [15]:
phev.isnull().sum()  # Where are the null values?
Out[15]:
Model             0
Make              1
Model.1           1
Vehicle Class     1
Motor             0
Engine Size       0
Cylinders         1
Transmission      1
Fuel              0
Consumption       0
Range 1           0
Recharge          0
Fuel.1            0
Consumption.1     0
Unnamed: 14       0
Unnamed: 15       0
Range 2           0
CO2 Emissions     0
CO2              20
Smog             36
dtype: int64
In [16]:
vs.duplicated().any()  # Any duplicated values?
Out[16]:
False
In [17]:
ev.duplicated().any()  # Any duplicated values?
Out[17]:
False
In [18]:
phev.duplicated().any()  # Any duplicated values?
Out[18]:
False
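`duplicated()` with no arguments only flags rows that are identical in every column, so a model that reappears in a later model year is not counted. A minimal sketch, assuming one wants to see such repeats as well; the frame below is hypothetical and only modelled on the i-MiEV and LEAF rows shown earlier.

```python
import pandas as pd

# Hypothetical rows modelled on the battery electric table;
# the same model can appear again in a later model year.
cars = pd.DataFrame({
    "Year": [2012, 2013, 2013],
    "Make": ["Mitsubishi", "Mitsubishi", "Nissan"],
    "Model": ["i-MiEV", "i-MiEV", "LEAF"],
    "Range_km": [100, 100, 117],
})

# Whole-row check: no row is identical in every column.
full_row_duplicates = cars.duplicated().any()

# Subset check: rows whose make and model occur more than once.
repeated_models = cars[cars.duplicated(subset=["Make", "Model"], keep=False)]

print(full_row_duplicates)
print(repeated_models)
```

Whether such repeats should be removed depends on the question: for a per-year comparison they are legitimate rows, not errors.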
In [19]:
vs.info()

# Summary of the data: index range, non-null counts and the data type of each column.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3465 entries, 0 to 3464
Data columns (total 17 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   REF_DATE       3465 non-null   int64  
 1   GEO            3465 non-null   object 
 2   DGUID          3465 non-null   object 
 3   Fuel Type      3465 non-null   object 
 4   Vehicle type   3465 non-null   object 
 5   Statistics     3465 non-null   object 
 6   UOM            3465 non-null   object 
 7   UOM_ID         3465 non-null   int64  
 8   SCALAR_FACTOR  3465 non-null   object 
 9   SCALAR_ID      3465 non-null   int64  
 10  VECTOR         3465 non-null   object 
 11  COORDINATE     3465 non-null   object 
 12  VALUE          2520 non-null   float64
 13  STATUS         945 non-null    object 
 14  SYMBOL         0 non-null      float64
 15  TERMINATED     0 non-null      float64
 16  DECIMALS       3465 non-null   int64  
dtypes: float64(3), int64(4), object(10)
memory usage: 460.3+ KB
In [20]:
ev.info()

# Summary of the data: index range, non-null counts and the data type of each column.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 179 entries, 0 to 178
Data columns (total 18 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   Model          179 non-null    object
 1   Make           178 non-null    object
 2   Model.1        178 non-null    object
 3   Vehicle Class  178 non-null    object
 4   Motor          179 non-null    object
 5   Transmission   178 non-null    object
 6   Fuel           179 non-null    object
 7   Consumption    179 non-null    object
 8   Unnamed: 8     179 non-null    object
 9   Unnamed: 9     179 non-null    object
 10  Unnamed: 10    179 non-null    object
 11  Unnamed: 11    179 non-null    object
 12  Unnamed: 12    179 non-null    object
 13  Range          179 non-null    object
 14  CO2 Emissions  179 non-null    object
 15  CO2            145 non-null    object
 16  Smog           118 non-null    object
 17  Recharge       179 non-null    object
dtypes: object(18)
memory usage: 25.3+ KB
In [21]:
phev.info()

# Summary of the data: index range, non-null counts and the data type of each column.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 137 entries, 0 to 136
Data columns (total 20 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Model          137 non-null    object 
 1   Make           136 non-null    object 
 2   Model.1        136 non-null    object 
 3   Vehicle Class  136 non-null    object 
 4   Motor          137 non-null    object 
 5   Engine Size    137 non-null    object 
 6   Cylinders      136 non-null    float64
 7   Transmission   136 non-null    object 
 8   Fuel           137 non-null    object 
 9   Consumption    137 non-null    object 
 10  Range 1        137 non-null    object 
 11  Recharge       137 non-null    object 
 12  Fuel.1         137 non-null    object 
 13  Consumption.1  137 non-null    object 
 14  Unnamed: 14    137 non-null    object 
 15  Unnamed: 15    137 non-null    object 
 16  Range 2        137 non-null    object 
 17  CO2 Emissions  137 non-null    object 
 18  CO2            117 non-null    object 
 19  Smog           101 non-null    object 
dtypes: float64(1), object(19)
memory usage: 21.5+ KB
In [22]:
vs.describe()  # Looking for outliers.
Out[22]:
          REF_DATE  UOM_ID  SCALAR_ID         VALUE  SYMBOL  TERMINATED  DECIMALS
count  3465.000000  3465.0     3465.0  2.520000e+03     0.0         0.0    3465.0
mean   2015.000000   300.0        0.0  4.794399e+04     NaN         NaN       0.0
std       2.582362     0.0        0.0  1.873545e+05     NaN         NaN       0.0
min    2011.000000   300.0        0.0  0.000000e+00     NaN         NaN       0.0
25%    2013.000000   300.0        0.0  0.000000e+00     NaN         NaN       0.0
50%    2015.000000   300.0        0.0  2.430000e+02     NaN         NaN       0.0
75%    2017.000000   300.0        0.0  1.065150e+04     NaN         NaN       0.0
max    2019.000000   300.0        0.0  2.033597e+06     NaN         NaN       0.0
In [23]:
ev.describe()  # Looking for outliers.
Out[23]:
Model Make Model.1 Vehicle Class Motor Transmission Fuel Consumption Unnamed: 8 Unnamed: 9 Unnamed: 10 Unnamed: 11 Unnamed: 12 Range CO2 Emissions CO2 Smog Recharge
count 179 178 178 178 179 178 179 179 179 179 179 179 179 179 179 145 118 179
unique 10 16 80 8 32 2 2 59 54 56 18 14 17 77 2 2 2 19
top 2020 Tesla i-MiEV Full-size 386 A1 B 16.9 22.5 19.6 1.9 2.3 2.2 109 0 10 10 12
freq 38 90 6 52 24 177 178 11 10 10 44 30 29 8 178 144 117 62
In [24]:
phev.describe()  # Looking for outliers.
Out[24]:
Cylinders
count 136.000000
mean 4.205882
std 1.142561
min 2.000000
25% 4.000000
50% 4.000000
75% 4.000000
max 8.000000
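Because every column of `ev` is still of dtype `object` at this point, `describe()` returns counts and most frequent values rather than quartiles, so it cannot reveal outliers yet. The following is a minimal sketch, assuming a column is first converted with `pd.to_numeric` and outliers are then flagged with the common 1.5 × IQR rule. That rule is one convention among several and is not the notebook's own method; the values are illustrative.

```python
import pandas as pd

# Illustrative text column with a stray header cell, as in the raw ev table.
raw_range = pd.Series(["(km)", "100", "117", "122", "100", "117", "509"])

# Convert to numbers; non-numeric cells such as "(km)" become NaN and are dropped.
range_km = pd.to_numeric(raw_range, errors="coerce").dropna()

# Flag values outside 1.5 * IQR from the quartiles.
q1, q3 = range_km.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = range_km[(range_km < lower) | (range_km > upper)]

print(outliers)
```

A flagged value is not necessarily an error. A long-range model can be a genuine data point, so flagged rows are worth inspecting before anything is removed.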

Let's Clean

Vehicle Sales Cleanup

In [25]:
vs
Out[25]:
REF_DATE GEO DGUID Fuel Type Vehicle type Statistics UOM UOM_ID SCALAR_FACTOR SCALAR_ID VECTOR COORDINATE VALUE STATUS SYMBOL TERMINATED DECIMALS
0 2011 Canada 2016A000011124 All fuel types Total, vehicle type Number of vehicles Units 300 units 0 v1079014832 1.1.1.1 1623288.0 NaN NaN NaN 0
1 2011 Canada 2016A000011124 All fuel types Passenger cars Number of vehicles Units 300 units 0 v1079014839 1.1.2.1 667842.0 NaN NaN NaN 0
2 2011 Canada 2016A000011124 All fuel types Pick up trucks Number of vehicles Units 300 units 0 v1079014846 1.1.3.1 298728.0 NaN NaN NaN 0
3 2011 Canada 2016A000011124 All fuel types Multi-purpose vehicles Number of vehicles Units 300 units 0 v1079014853 1.1.4.1 530754.0 NaN NaN NaN 0
4 2011 Canada 2016A000011124 All fuel types Vans Number of vehicles Units 300 units 0 v1079014860 1.1.5.1 125964.0 NaN NaN NaN 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3460 2019 British Columbia and the Territories 2016A000259 Other fuel types Total, vehicle type Number of vehicles Units 300 units 0 v1079015188 11.7.1.1 32.0 NaN NaN NaN 0
3461 2019 British Columbia and the Territories 2016A000259 Other fuel types Passenger cars Number of vehicles Units 300 units 0 v1079015195 11.7.2.1 16.0 NaN NaN NaN 0
3462 2019 British Columbia and the Territories 2016A000259 Other fuel types Pick up trucks Number of vehicles Units 300 units 0 v1079015202 11.7.3.1 0.0 NaN NaN NaN 0
3463 2019 British Columbia and the Territories 2016A000259 Other fuel types Multi-purpose vehicles Number of vehicles Units 300 units 0 v1079015209 11.7.4.1 7.0 NaN NaN NaN 0
3464 2019 British Columbia and the Territories 2016A000259 Other fuel types Vans Number of vehicles Units 300 units 0 v1079015216 11.7.5.1 9.0 NaN NaN NaN 0

3465 rows × 17 columns

In [26]:
vs.fillna(0, inplace=True)

# Replace all missing values with zero instead of dropping the rows, to keep the visualizations simple.
# Filling with the mean instead would make the sales counts inaccurate.
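The reasoning in the comment above can be checked on a tiny made-up series. This sketch compares how zero-filling and mean-filling change the total and the average of sales counts; the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up sales counts with one unavailable figure.
sales = pd.Series([200.0, np.nan, 100.0])

filled_zero = sales.fillna(0)
filled_mean = sales.fillna(sales.mean())

# pandas skips NaN by default, so the original total is 300 and the mean is 150.
total_original = sales.sum()
mean_original = sales.mean()

# Zero-filling keeps the total unchanged but pulls the average down.
total_zero = filled_zero.sum()
mean_zero = filled_zero.mean()

# Mean-filling invents 150 extra sales and inflates the total.
total_mean = filled_mean.sum()

print(total_original, total_zero, total_mean)
print(mean_original, mean_zero)
```

Zero-filling is therefore safe for summed sales counts, but any later average computed over the filled column will be biased downward.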
In [27]:
vs.drop([
    'DGUID', 'Statistics', 'UOM', 'UOM_ID', 'SCALAR_FACTOR', 'SCALAR_ID',
    'VECTOR', 'COORDINATE', 'STATUS', 'SYMBOL', 'TERMINATED', 'DECIMALS'
],
        axis=1,
        inplace=True)

# Drop the columns that carry no information for this analysis.
# Keep only the number of units sold per vehicle type and fuel type in each location from 2011 to 2019.
In [28]:
vs.columns  # View all Columns names before renaming.
Out[28]:
Index(['REF_DATE', 'GEO', 'Fuel Type', 'Vehicle type', 'VALUE'], dtype='object')
In [29]:
vs.columns = [
    'Date', 'Location', 'Fuel_Type', 'Vehicle_type', 'Number_of_Cars_sold'
]

# Rename the columns so that their meaning is clearer.
In [30]:
vs = vs[~vs['Fuel_Type'].isin(
    ['All fuel types', 'Gasoline', 'Diesel', 'Other fuel types'])]

# Drop the aggregate rows ('All fuel types'), the conventional fuels (gasoline and diesel)
# and other fuel types (e.g. liquid propane), keeping the remaining fuel types
# such as battery electric and plug-in hybrid electric.
In [31]:
vs  # Let's have a look....is it readable now?
Out[31]:
      Date                              Location                Fuel_Type            Vehicle_type  Number_of_Cars_sold
15    2011                                Canada         Battery electric     Total, vehicle type                215.0
16    2011                                Canada         Battery electric          Passenger cars                198.0
17    2011                                Canada         Battery electric          Pick up trucks                  0.0
18    2011                                Canada         Battery electric  Multi-purpose vehicles                  0.0
19    2011                                Canada         Battery electric                    Vans                 17.0
...    ...                                   ...                      ...                     ...                  ...
3455  2019  British Columbia and the Territories  Plug-in hybrid electric     Total, vehicle type               4466.0
3456  2019  British Columbia and the Territories  Plug-in hybrid electric          Passenger cars               2551.0
3457  2019  British Columbia and the Territories  Plug-in hybrid electric          Pick up trucks                  0.0
3458  2019  British Columbia and the Territories  Plug-in hybrid electric  Multi-purpose vehicles               1713.0
3459  2019  British Columbia and the Territories  Plug-in hybrid electric                    Vans                202.0

1485 rows × 5 columns
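The cleaned table still mixes detail rows with aggregate rows: in the 2011 Canada battery electric lines above, `Total, vehicle type` (215) equals the sum of the four vehicle types (198 + 0 + 0 + 17). Summing the column as-is would count those sales twice. A minimal sketch of excluding the total rows before aggregating, using those five rows:

```python
import pandas as pd

# Rows copied from the 2011 Canada battery electric lines shown above.
sales = pd.DataFrame({
    "Location": ["Canada"] * 5,
    "Vehicle_type": [
        "Total, vehicle type", "Passenger cars", "Pick up trucks",
        "Multi-purpose vehicles", "Vans",
    ],
    "Number_of_Cars_sold": [215.0, 198.0, 0.0, 0.0, 17.0],
})

# Naive sum counts every sale twice because the total row is included.
naive_total = sales["Number_of_Cars_sold"].sum()

# Keep only the detail rows before summing.
detail = sales[sales["Vehicle_type"] != "Total, vehicle type"]
detail_total = detail["Number_of_Cars_sold"].sum()

print(naive_total, detail_total)
```

The same caution likely applies to the `Location` column, on the assumption (not verified here) that the `Canada` rows aggregate the regional rows.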

Electric Vehicles Cleanup

In [32]:
ev = pd.read_csv('2012-2020_Battery_Electric_Vehicles.csv', header=[0, 1])
ev.columns = ev.columns.map('_'.join)

# Read the first two rows as a two-level header and join each pair into a single column name.
In [33]:
ev = ev.replace(np.nan, 0)  # Replace all NaN with 0 and keep the result.
ev
Out[33]:
Model_Year Make_Unnamed: 1_level_1 Model_Unnamed: 2_level_1 Vehicle Class_Unnamed: 3_level_1 Motor_(kW) Transmission_Unnamed: 5_level_1 Fuel_Type Consumption_City (kWh/100 km) Unnamed: 8_level_0_Hwy (kWh/100 km) Unnamed: 9_level_0_Comb (kWh/100 km) Unnamed: 10_level_0_City (Le/100 km) Unnamed: 11_level_0_Hwy (Le/100 km) Unnamed: 12_level_0_Comb (Le/100 km) Range_(km) CO2 Emissions_(g/km) CO2 _Rating Smog_Rating Recharge_Time (h)
0 2012 Mitsubishi i-MiEV Subcompact 49 A1 B 16.9 21.4 18.7 1.9 2.4 2.1 100 0 0.0 0.0 7.0
1 2012 Nissan LEAF Mid-size 80 A1 B 19.3 23.0 21.1 2.2 2.6 2.4 117 0 0.0 0.0 7.0
2 2013 Ford Focus Electric Compact 107 A1 B 19.0 21.1 20.0 2.1 2.4 2.2 122 0 0.0 0.0 4.0
3 2013 Mitsubishi i-MiEV Subcompact 49 A1 B 16.9 21.4 18.7 1.9 2.4 2.1 100 0 0.0 0.0 7.0
4 2013 Nissan LEAF Mid-size 80 A1 B 19.3 23.0 21.1 2.2 2.6 2.4 117 0 0.0 0.0 7.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
173 2020 Tesla Model X Performance (22" Wheels) SUV: Standard 580 A1 B 26.2 27.1 26.6 2.9 3.0 3.0 438 0 10.0 10.0 12.0
174 2020 Tesla Model Y Long Range AWD SUV: Small 358 A1 B 16.5 18.3 17.3 1.9 2.1 1.9 509 0 10.0 10.0 10.0
175 2020 Tesla Model Y Performance AWD SUV: Small 358 A1 B 16.3 18.7 17.4 1.8 2.1 2.0 507 0 10.0 10.0 10.0
176 2020 Tesla Model Y Performance AWD (21" Wheels) SUV: Small 358 A1 B 18.1 19.8 18.9 2.0 2.2 2.1 468 0 10.0 10.0 10.0
177 2020 Volkswagen e-Golf Compact 100 A1 B 17.4 19.9 18.6 1.9 2.3 2.1 198 0 10.0 10.0 5.3

178 rows × 18 columns

In [34]:
ev.columns  # View all Columns names before renaming.
Out[34]:
Index(['Model_Year', 'Make_Unnamed: 1_level_1', 'Model_Unnamed: 2_level_1',
       'Vehicle Class_Unnamed: 3_level_1', 'Motor_(kW)',
       'Transmission_Unnamed: 5_level_1', 'Fuel_Type',
       'Consumption_City (kWh/100 km)', 'Unnamed: 8_level_0_Hwy (kWh/100 km)',
       'Unnamed: 9_level_0_Comb (kWh/100 km)',
       'Unnamed: 10_level_0_City (Le/100 km)',
       'Unnamed: 11_level_0_Hwy (Le/100 km)',
       'Unnamed: 12_level_0_Comb (Le/100 km)', 'Range_(km)',
       'CO2 Emissions_(g/km)', 'CO2 _Rating', 'Smog_Rating',
       'Recharge_Time (h)'],
      dtype='object')
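The joined names above still carry pandas' `Unnamed: N_level_M` placeholders, which is why the columns are renamed by hand in the next cell. As an alternative, here is a minimal sketch of a helper that drops those placeholders while joining. `flatten_header` is a hypothetical name introduced for illustration, and the labels are a subset of the ones listed above.

```python
import pandas as pd


def flatten_header(columns):
    """Join two-level column labels, skipping pandas' 'Unnamed: ...' placeholders."""
    flat = []
    for parts in columns:
        kept = [str(p).strip() for p in parts if not str(p).startswith("Unnamed:")]
        flat.append(" ".join(kept))
    return flat


# A few of the two-level labels produced by header=[0, 1], as listed above.
columns = pd.MultiIndex.from_tuples([
    ("Model", "Year"),
    ("Make", "Unnamed: 1_level_1"),
    ("Consumption", "City (kWh/100 km)"),
    ("Unnamed: 8_level_0", "Hwy (kWh/100 km)"),
    ("CO2 ", "Rating"),
])

flat_names = flatten_header(columns)
print(flat_names)
```

Manual renaming, as done in the notebook, gives full control over the final names. A helper like this mainly saves effort when the same layout has to be read for several files.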
In [35]:
ev.columns = [
    'Year', 'Make', 'Model', 'Vehicle Class', 'Motor(kW)', 'Transmission',
    'Fuel Type', 'City(kWh/100 km)', 'Highway(kWh/100 km)',
    'City+Highway(kWh/100 km)', 'City(Litre/100 km)', 'Highway(Litre/100 km)',
    'City+Highway(Litre/100 km)', 'Range(km)', 'CO2 Emissions(g/km)',
    'CO2 Rating', 'Smog Rating', 'Recharge_Time(hours)'
]

# Rename the columns so that their meaning is clearer.
In [36]:
ev.drop([
    'Transmission', 'Fuel Type', 'City(Litre/100 km)', 'Highway(Litre/100 km)',
    'City+Highway(Litre/100 km)', 'CO2 Emissions(g/km)', 'CO2 Rating',
    'Smog Rating'
],
        axis=1,
        inplace=True)

# Drop Transmission, Fuel Type (all cars are electric), the litre-based consumption columns and the emissions and rating columns, as they are not used in this analysis.
In [37]:
ev  # Let's have a look....is it readable now?
Out[37]:
Year Make Model Vehicle Class Motor(kW) City(kWh/100 km) Highway(kWh/100 km) City+Highway(kWh/100 km) Range(km) Recharge_Time(hours)
0 2012 Mitsubishi i-MiEV Subcompact 49 16.9 21.4 18.7 100 7.0
1 2012 Nissan LEAF Mid-size 80 19.3 23.0 21.1 117 7.0
2 2013 Ford Focus Electric Compact 107 19.0 21.1 20.0 122 4.0
3 2013 Mitsubishi i-MiEV Subcompact 49 16.9 21.4 18.7 100 7.0
4 2013 Nissan LEAF Mid-size 80 19.3 23.0 21.1 117 7.0
... ... ... ... ... ... ... ... ... ... ...
173 2020 Tesla Model X Performance (22" Wheels) SUV: Standard 580 26.2 27.1 26.6 438 12.0
174 2020 Tesla Model Y Long Range AWD SUV: Small 358 16.5 18.3 17.3 509 10.0
175 2020 Tesla Model Y Performance AWD SUV: Small 358 16.3 18.7 17.4 507 10.0
176 2020 Tesla Model Y Performance AWD (21" Wheels) SUV: Small 358 18.1 19.8 18.9 468 10.0
177 2020 Volkswagen e-Golf Compact 100 17.4 19.9 18.6 198 5.3

178 rows × 10 columns
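One way to compare efficiency across models is distance per unit of energy, the reciprocal of the consumption figure. The following is a minimal sketch on three rows from the table above. The column names `km_per_kWh` and `energy_for_full_range_kWh` are hypothetical, and the second is only a rough estimate (rated range times rated consumption), not a battery specification.

```python
import pandas as pd

# Three rows taken from the cleaned ev table above.
ev_sample = pd.DataFrame({
    "Model": ["i-MiEV", "LEAF", "Model Y Long Range AWD"],
    "City+Highway(kWh/100 km)": [18.7, 21.1, 17.3],
    "Range(km)": [100, 117, 509],
})

# Kilometres travelled per kWh: higher means more efficient.
ev_sample["km_per_kWh"] = 100 / ev_sample["City+Highway(kWh/100 km)"]

# Rough energy needed to cover the rated range at the rated consumption.
ev_sample["energy_for_full_range_kWh"] = (
    ev_sample["Range(km)"] * ev_sample["City+Highway(kWh/100 km)"] / 100
)

most_efficient = ev_sample.loc[ev_sample["km_per_kWh"].idxmax(), "Model"]
print(ev_sample)
print(most_efficient)
```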

Plug-in Hybrid Electric Vehicles Cleanup

In [38]:
phev
Out[38]:
Model Make Model.1 Vehicle Class Motor Engine Size Cylinders Transmission Fuel Consumption Range 1 Recharge Fuel.1 Consumption.1 Unnamed: 14 Unnamed: 15 Range 2 CO2 Emissions CO2 Smog
0 Year NaN NaN NaN (kW) (L) NaN NaN Type 1 Combined Le/100 km (km) Time (h) Type 2 City (L/100 km) Hwy (L/100 km) Comb (L/100 km) (km) (g/km) Rating Rating
1 2012 Chevrolet Volt Compact 111 1.4 4.0 AV B 2.5 (22.3 kWh/100 km) 56 4 Z 6.7 5.9 6.4 550 54 NaN NaN
2 2013 Chevrolet Volt Compact 111 1.4 4.0 AV B 2.4 (21.4 kWh/100 km) 61 4 Z 6.7 5.9 6.4 550 45 NaN NaN
3 2013 Ford C-MAX Energi Mid-size 35 2 4.0 AV B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.5 X 5.8 6.5 6.1 856 80 NaN NaN
4 2013 Ford Fusion Energi Mid-size 35 2 4.0 AV B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.5 X 5.8 6.5 6.1 856 80 NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
132 2020 Volvo S60 T8 AWD Compact 65 2 4.0 AS8 B/Z* 3.2 ([29.0 kWh + 0.0 L]/100 km) 35 3 Z 8.4 7 7.8 781 94 10 7
133 2020 Volvo S90 T8 AWD Mid-size 65 2 4.0 AS8 B/Z 3.7 ([31.4 kWh + 0.2 L]/100 km) 34 3 Z 8.3 7.5 7.9 761 99 10 7
134 2020 Volvo V60 T8 AWD Station wagon: Small 65 2 4.0 AS8 B/Z* 3.2 ([29.0 kWh + 0.0 L]/100 km) 35 3 Z 8.4 7 7.8 781 94 10 7
135 2020 Volvo XC60 T8 AWD SUV: Small 65 2 4.0 AS8 B/Z* 4.2 ([38.2 kWh + 0.0 L]/100 km) 27 3 Z 9.5 8.7 9.1 781 128 9 7
136 2020 Volvo XC90 T8 AWD SUV: Standard 65 2 4.0 AS8 B/Z* 4.0 ([36.1 kWh + 0.0 L]/100 km) 29 3 Z 9.1 8.4 8.8 813 120 10 7

137 rows × 20 columns

In [39]:
phev = pd.read_csv('2012-2020_Plug-in_Hybrid_Electric_Vehicles.csv',
                   header=[0, 1])
phev.columns = phev.columns.map('_'.join)

# Read the first two rows as a two-level header and join each pair into a single column name.
In [40]:
phev.columns = [
    'Year', 'Make', 'Model', 'Vehicle Class', 'Motor(kW)',
    'Engine Size(Litre)', 'Cylinders', 'Transmission',
    'Fuel Type(Electricity/Gasoline)', 'Electricity/Gasoline(100km)',
    'Range(km)', 'Recharge Time(hours)', 'Fuel Type 2(Gasoline only)',
    'City(Litre/100km)', 'Highway(Litre/100km)', 'Highway&City(Litre/100km)',
    'Combined_Range(km)', 'CO2 Emissions(g/km)', 'CO2 Rating', 'Smog Rating'
]

# Rename the columns so they are clearer.
In [41]:
phev.drop(['Transmission'], axis=1, inplace=True)

# Drop Transmission, since it has no bearing on our analysis.
In [42]:
phev['CO2 Rating'] = phev['CO2 Rating'].fillna(0)

# Replace all NaN values with 0 (assigning back avoids the chained in-place call).
In [43]:
phev['Smog Rating'] = phev['Smog Rating'].fillna(0)

# Replace all NaN values with 0 (assigning back avoids the chained in-place call).
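One thing to keep in mind with this step: after `fillna(0)`, a missing rating is indistinguishable from a real score of 0, so any average over the column is pulled down. A small sketch with made-up ratings (not values from the dataset) shows the size of the effect.

```python
import numpy as np
import pandas as pd

# Made-up ratings: two missing values and two real scores.
ratings = pd.Series([np.nan, np.nan, 10.0, 8.0])

# pandas skips NaN by default, so only the real scores are averaged.
mean_skipping_nan = ratings.mean()

# After filling, the two zeros count as real observations.
mean_with_zeros = ratings.fillna(0).mean()

print(mean_skipping_nan, mean_with_zeros)
```

If the ratings are later used in statistics or correlations, leaving them as NaN may be the safer choice. Filling with 0 is mainly a display convenience.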
In [44]:
phev  # Let's have a look....is it readable now?
Out[44]:
Year Make Model Vehicle Class Motor(kW) Engine Size(Litre) Cylinders Fuel Type(Electricity/Gasoline) Electricity/Gasoline(100km) Range(km) Recharge Time(hours) Fuel Type 2(Gasoline only) City(Litre/100km) Highway(Litre/100km) Highway&City(Litre/100km) Combined_Range(km) CO2 Emissions(g/km) CO2 Rating Smog Rating
0 2012 Chevrolet Volt Compact 111 1.4 4 B 2.5 (22.3 kWh/100 km) 56 4.0 Z 6.7 5.9 6.4 550 54 0.0 0.0
1 2013 Chevrolet Volt Compact 111 1.4 4 B 2.4 (21.4 kWh/100 km) 61 4.0 Z 6.7 5.9 6.4 550 45 0.0 0.0
2 2013 Ford C-MAX Energi Mid-size 35 2.0 4 B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.5 X 5.8 6.5 6.1 856 80 0.0 0.0
3 2013 Ford Fusion Energi Mid-size 35 2.0 4 B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.5 X 5.8 6.5 6.1 856 80 0.0 0.0
4 2013 Toyota Prius Plug-in Hybrid Mid-size 60 1.8 4 B/X 2.5 ([18.0 kWh + 0.4 L]/100 km) 18 1.5 X 4.7 4.8 4.7 845 101 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
131 2020 Volvo S60 T8 AWD Compact 65 2.0 4 B/Z* 3.2 ([29.0 kWh + 0.0 L]/100 km) 35 3.0 Z 8.4 7.0 7.8 781 94 10.0 7.0
132 2020 Volvo S90 T8 AWD Mid-size 65 2.0 4 B/Z 3.7 ([31.4 kWh + 0.2 L]/100 km) 34 3.0 Z 8.3 7.5 7.9 761 99 10.0 7.0
133 2020 Volvo V60 T8 AWD Station wagon: Small 65 2.0 4 B/Z* 3.2 ([29.0 kWh + 0.0 L]/100 km) 35 3.0 Z 8.4 7.0 7.8 781 94 10.0 7.0
134 2020 Volvo XC60 T8 AWD SUV: Small 65 2.0 4 B/Z* 4.2 ([38.2 kWh + 0.0 L]/100 km) 27 3.0 Z 9.5 8.7 9.1 781 128 9.0 7.0
135 2020 Volvo XC90 T8 AWD SUV: Standard 65 2.0 4 B/Z* 4.0 ([36.1 kWh + 0.0 L]/100 km) 29 3.0 Z 9.1 8.4 8.8 813 120 10.0 7.0

136 rows × 19 columns

In [45]:
vs.to_csv('vs.csv')
ev.to_csv('ev.csv')
phev.to_csv('phev.csv')

# Saving the cleaned data for visualization.

Data Visualization

Vehicle Sales Visualization

In [46]:
vsp = pandas_profiling.ProfileReport(vs)

# Pandas Profiling for Vehicle Sales
In [47]:
vsp.to_notebook_iframe()

# Render the interactive profiling report inline in the notebook (HTML view).



In [48]:
sns.heatmap(vs.corr())
plt.title('Vehicle Sales');

# Visualizing the correlation on a heatmap.
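Correlation is only defined for numeric columns. Depending on the pandas version, calling `corr()` on a frame that still contains text columns such as `Location` either drops them silently or raises an error. A version-independent sketch, using a made-up frame that borrows the column names from `vs`, selects the numeric columns explicitly first.

```python
import pandas as pd

# Made-up rows; only the column names mirror the vehicle sales table.
demo = pd.DataFrame({
    "Date": [2011, 2012, 2013, 2014],
    "Location": ["Quebec", "Ontario", "Quebec", "Alberta"],
    "Number_of_Cars_sold": [10.0, 20.0, 30.0, 40.0],
})

# Keep only numeric columns, then compute pairwise Pearson correlation.
numeric = demo.select_dtypes(include="number")
corr = numeric.corr()
print(corr)
```

The resulting `corr` frame can be passed to `sns.heatmap` exactly as in the cell above.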
In [49]:
sns.barplot(x="Number_of_Cars_sold", y="Location", hue="Fuel_Type", data=vs)
plt.xlabel('Units Sold')
plt.ylabel('Province')
plt.title('Number of cars sold in each location across Canada');

# Bar plot: how many cars were sold in each province?

1.PNG

In [50]:
sns.pointplot(x="Number_of_Cars_sold", y="Location", hue="Fuel_Type", data=vs)
plt.xlabel('Units Sold')
plt.ylabel('Province')
plt.title('Number of cars sold in each location across Canada');

# Point plot: how many cars were sold in each province?
In [51]:
plt.scatter(x=vs['Number_of_Cars_sold'],
            y=vs['Fuel_Type'],
            alpha=0.5,
            s=50,
            color='g',
            marker='x')
plt.xlabel('Units Sold')
plt.ylabel('Fuel Type')
plt.title('Number of cars sold per fuel type');

# Matplotlib scatter plot: how many cars were sold for each fuel type?

2.PNG

In [52]:
# The whole data on a scatter matrix.
scatter_matrix(vs, alpha=0.5, figsize=(12, 12), diagonal='kde', s=100)
plt.title('Number of cars sold per year');

# Scatter matrix: how many cars were sold per calendar year?
In [53]:
sns.catplot(x="Number_of_Cars_sold",
            y="Vehicle_type",
            hue="Fuel_Type",
            data=vs,
            col="Date",
            kind="box",
            height=3.5,
            aspect=1.5)
plt.title('Cars sold by vehicle and fuel type over the years');

# Box plots via catplot (the renamed factorplot), one panel per year: cars sold by vehicle type and fuel type.
In [54]:
g = sns.PairGrid(vs,
                 x_vars=["Number_of_Cars_sold", "Date"],
                 y_vars=["Fuel_Type", "Location"],
                 aspect=1.75,
                 height=2.25)
g.map(sns.violinplot, palette="Blues_r")
plt.title('Cars sold over the years by fuel type across Canada');

# Showing categorical relationships on a pair grid of violin plots.

Electric Vehicles Visualization

In [55]:
pev = pandas_profiling.ProfileReport(ev)

# Pandas Profiling for Electric Vehicle.
In [56]:
pev.to_notebook_iframe()

# Render the interactive profiling report inline in the notebook (HTML view).



In [57]:
sns.heatmap(ev.corr())
plt.title('Electric vehicle')

# Visualizing the correlation on a heatmap.
Out[57]:
Text(0.5, 1.0, 'Electric vehicle')
In [58]:
plt.scatter(x=ev['Range(km)'],
            y=ev['Recharge_Time(hours)'],
            alpha=0.5,
            s=50,
            color='b',
            marker='x')
plt.xlabel('Range (km)')
plt.ylabel('Recharge time (hours)')
plt.title('Recharge time vs. range')

# Matplotlib scatter plot: how far can a car go on one charge, and how long does that charge take?
Out[58]:
Text(0.5, 1.0, 'Recharge time vs. range')
In [59]:
sns.catplot(x="Recharge_Time(hours)",
            y="Range(km)",
            data=ev,
            col="Make",
            kind="box",
            height=3.5,
            aspect=1.5)
plt.title('Range (km) by recharge time for each make');

# Box plots via catplot (the renamed factorplot), one panel per make: how far can each make go on one charge?

3.PNG

Plug-in Hybrid Electric Vehicles Visualization

In [60]:
sns.heatmap(phev.corr())
plt.title('Plug-in Hybrid Electric Vehicle')

# Visualizing the correlation on a heatmap.
Out[60]:
Text(0.5, 1.0, 'Plug-in Hybrid Electric Vehicle')
In [61]:
sns.barplot(x="Combined_Range(km)", y="Make", data=phev)
plt.xlabel('Range by km for hybrid vehicles')
plt.ylabel('Make of the car')
plt.title('Range by make of Hybrid cars');

# Bar plot: which make has the longest combined range?

4.PNG

Train/Test Split

In [62]:
# With no arguments, train_test_split shuffles the rows and holds out 25% as the test set.
train1, test1 = model_selection.train_test_split(vs)
In [63]:
train2, test2 = model_selection.train_test_split(ev)
In [64]:
train3, test3 = model_selection.train_test_split(phev)
In [65]:
len(test1) / (len(train1) + len(test1))
Out[65]:
0.2505050505050505
In [66]:
len(test2) / (len(train2) + len(test2))
Out[66]:
0.25280898876404495
In [67]:
len(test3) / (len(train3) + len(test3))
Out[67]:
0.25
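The first two fractions are slightly above 0.25 because the test set must contain a whole number of rows. To my understanding, scikit-learn rounds the test size up (`ceil(test_size * n_samples)`) and gives the remainder to the training set. The sketch below reproduces that arithmetic. `split_sizes` is a hypothetical helper written for this illustration, and the row counts are the train and test totals printed elsewhere in this notebook.

```python
import math

def split_sizes(n_samples, test_size=0.25):
    """Return (n_train, n_test), rounding the test share up to a whole row."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

# Row counts after cleanup: vs = 1113 + 372, ev = 133 + 45, phev = 102 + 34.
for name, n in [("vs", 1485), ("ev", 178), ("phev", 136)]:
    n_train, n_test = split_sizes(n)
    print(name, n_train, n_test, n_test / n)
```

Only `phev`, whose 136 rows divide evenly by four, lands on exactly 0.25.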
In [68]:
vs = vs.sample(frac=1)
In [69]:
ev = ev.sample(frac=1)
In [70]:
phev = phev.sample(frac=1)
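Note that these shuffles run after the split, so they reorder `vs`, `ev` and `phev` but leave `train1` to `test3` untouched. They also use no seed, so each run produces a different order. If a repeatable order is wanted, one option is to pass `random_state`, as in this sketch on a made-up frame (the seed value 42 is arbitrary).

```python
import pandas as pd

# Made-up frame standing in for one of the cleaned tables.
demo = pd.DataFrame({"Year": [2016, 2017, 2018, 2019, 2020]})

# The same seed yields the same permutation on every call.
first = demo.sample(frac=1, random_state=42)
second = demo.sample(frac=1, random_state=42)

print(list(first.index))
print(list(second.index))
```

`train_test_split` accepts a `random_state` argument too, which would make the train and test tables shown below repeatable as well.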
In [71]:
train1
Out[71]:
Date Location Fuel_Type Vehicle_type Number_of_Cars_sold
26 2011 Canada Plug-in hybrid electric Passenger cars 303.0
1731 2015 Quebec Battery electric Passenger cars 1556.0
1948 2016 Canada Hybrid electric Multi-purpose vehicles 7845.0
2996 2018 Saskatchewan Hybrid electric Passenger cars 101.0
540 2012 New Brunswick Battery electric Total, vehicle type 1.0
... ... ... ... ... ...
1282 2014 Nova Scotia Hybrid electric Pick up trucks 0.0
2993 2018 Saskatchewan Battery electric Multi-purpose vehicles 8.0
2368 2017 Newfoundland and Labrador Hybrid electric Multi-purpose vehicles 0.0
2724 2018 Canada Plug-in hybrid electric Vans 1419.0
3068 2018 British Columbia and the Territories Hybrid electric Multi-purpose vehicles 3482.0

1113 rows × 5 columns

In [72]:
test1
Out[72]:
Date Location Fuel_Type Vehicle_type Number_of_Cars_sold
2606 2017 Saskatchewan Battery electric Passenger cars 10.0
64 2011 Newfoundland and Labrador Plug-in hybrid electric Vans 0.0
1737 2015 Quebec Hybrid electric Pick up trucks 0.0
2783 2018 Prince Edward Island Battery electric Multi-purpose vehicles 1.0
190 2011 Quebec Battery electric Total, vehicle type 64.0
... ... ... ... ... ...
3244 2019 New Brunswick Hybrid electric Vans 0.0
541 2012 New Brunswick Battery electric Passenger cars 1.0
1497 2014 Alberta Plug-in hybrid electric Pick up trucks 0.0
3026 2018 Alberta Battery electric Passenger cars 0.0
1592 2015 Newfoundland and Labrador Battery electric Pick up trucks 0.0

372 rows × 5 columns

In [73]:
train2
Out[73]:
Year Make Model Vehicle Class Motor(kW) City(kWh/100 km) Highway(kWh/100 km) City+Highway(kWh/100 km) Range(km) Recharge_Time(hours)
49 2016 Tesla Model S 70D Full-size 386 20.8 20.6 20.7 386 12.0
34 2016 BMW i3 Subcompact 125 15.2 18.8 16.8 130 4.0
48 2016 Tesla Model S 60D Full-size 386 20.6 19.6 20.2 351 10.0
74 2017 Tesla Model S 100D Full-size 386 20.7 20.5 20.6 539 12.0
72 2017 Tesla Model S 75D Full-size 386 20.6 19.9 20.3 417 12.0
... ... ... ... ... ... ... ... ... ... ...
156 2020 Tesla Model 3 Standard Range Plus Mid-size 211 14.1 15.9 14.9 402 8.5
159 2020 Tesla Model 3 Long Range AWD Mid-size 335 16.9 18.0 17.4 518 10.0
12 2014 Ford Focus Electric Compact 107 19.0 21.1 20.0 122 4.0
140 2020 Audi e-tron Sportback 55 quattro SUV: Standard 300 27.6 26.9 27.3 351 10.0
150 2020 MINI Cooper SE 3 Door Subcompact 135 18.2 20.9 19.4 177 4.0

133 rows × 10 columns

In [74]:
test2
Out[74]:
Year Make Model Vehicle Class Motor(kW) City(kWh/100 km) Highway(kWh/100 km) City+Highway(kWh/100 km) Range(km) Recharge_Time(hours)
117 2019 smart EQ fortwo cabriolet Two-seater 60 18.7 23.1 20.7 92 3.0
129 2019 Tesla Model S Long Range Full-size 398 18.2 19.5 18.8 595 12.0
132 2019 Tesla Model X 75D SUV: Standard 386 23.0 21.9 22.5 383 12.0
151 2020 Nissan LEAF (40 kWh) Mid-size 110 17.0 21.2 18.9 240 8.0
39 2016 Nissan LEAF (24 kWh) Mid-size 80 16.5 20.8 18.4 135 5.0
2 2013 Ford Focus Electric Compact 107 19.0 21.1 20.0 122 4.0
152 2020 Nissan LEAF S PLUS Mid-size 160 17.8 21.5 19.5 363 11.0
100 2018 Tesla Model S P100D Full-size 568 22.6 20.0 21.5 507 12.0
45 2016 Tesla Model S (70 kWh) Full-size 285 23.8 23.2 23.6 377 12.0
15 2014 smart fortwo electric drive cabriolet Two-seater 35 17.2 22.5 19.6 109 8.0
63 2017 Chevrolet Bolt EV Station wagon: Small 150 16.4 19.0 17.6 383 9.3
91 2018 smart fortwo electric drive cabriolet Two-seater 60 18.7 23.1 20.7 92 3.0
144 2020 Hyundai IONIQ Electric Mid-size 100 14.5 17.4 15.8 274 5.8
25 2015 Nissan LEAF Mid-size 80 16.5 20.8 18.4 135 5.0
78 2017 Tesla Model X 75D SUV: Standard 386 23.0 21.9 22.5 383 12.0
89 2018 Kia Soul EV Station wagon: Small 81 16.8 22.4 19.3 179 5.0
138 2019 Tesla Model X Performance (22" Wheels) SUV: Standard 580 26.2 27.1 26.6 435 12.0
85 2018 BMW i3s Subcompact 135 17.1 21.1 18.9 172 5.0
145 2020 Hyundai Kona Electric SUV: Small 150 16.2 19.3 17.4 415 9.0
167 2020 Tesla Model S Performance (19" Wheels) Full-size 580 20.1 20.2 20.2 560 12.0
55 2016 Tesla Model S P100D Full-size 568 22.6 20.0 21.5 507 12.0
52 2016 Tesla Model S 90D (Refresh) Full-size 386 20.8 19.7 20.3 473 12.0
1 2012 Nissan LEAF Mid-size 80 19.3 23.0 21.1 117 7.0
62 2017 BMW i3 (94 Ah) Subcompact 125 16.2 19.7 17.8 183 5.0
149 2020 Kia Soul EV (180 Ah) Station wagon: Small 150 16.8 21.1 18.6 383 9.5
17 2014 Tesla Model S (60 kWh) Full-size 225 22.2 21.7 21.9 335 10.0
153 2020 Nissan LEAF SV/SL PLUS Mid-size 160 18.3 22.1 20.0 349 11.0
18 2014 Tesla Model S (85 kWh) Full-size 270 23.8 23.2 23.6 426 12.0
171 2020 Tesla Model X Long Range Plus SUV: Standard 398 19.3 20.8 20.0 565 12.0
163 2020 Tesla Model S Standard Range Full-size 398 18.5 20.0 19.2 462 9.0
95 2018 Tesla Model 3 Long Range AWD Mid-size 335 17.4 18.7 18.0 499 10.0
22 2015 Ford Focus Electric Compact 107 19.0 21.1 20.0 122 4.0
47 2016 Tesla Model S (85/90 kWh) Full-size 285 23.8 23.2 23.6 426 12.0
16 2014 smart fortwo electric drive coupe Two-seater 35 17.2 22.5 19.6 109 8.0
51 2016 Tesla Model S 85D/90D Full-size 386 22.0 19.8 21.0 435 12.0
161 2020 Tesla Model 3 Long Range AWD Performance (19" Wheels) Mid-size 358 17.6 18.7 18.1 489 10.0
64 2017 Ford Focus Electric Compact 107 17.7 21.8 19.6 185 5.5
123 2019 Tesla Model 3 Long Range AWD Mid-size 335 17.4 18.7 18.0 499 10.0
59 2016 Tesla Model X P90D SUV: Standard 568 23.6 23.3 23.5 402 12.0
169 2020 Tesla Model X Standard Range SUV: Standard 398 20.0 21.5 20.7 415 9.0
56 2016 Tesla Model X 60D SUV: Standard 386 23.0 22.2 22.6 322 10.0
54 2016 Tesla Model S P90D (Refresh) Full-size 568 22.9 21.0 22.1 435 12.0
24 2015 Mitsubishi i-MiEV Subcompact 49 16.9 21.4 18.7 100 7.0
170 2020 Tesla Model X Long Range SUV: Standard 398 21.2 22.5 21.8 528 12.0
26 2015 smart fortwo electric drive cabriolet Two-seater 35 17.2 22.5 19.6 109 8.0
In [75]:
train3
Out[75]:
Year Make Model Vehicle Class Motor(kW) Engine Size(Litre) Cylinders Fuel Type(Electricity/Gasoline) Electricity/Gasoline(100km) Range(km) Recharge Time(hours) Fuel Type 2(Gasoline only) City(Litre/100km) Highway(Litre/100km) Highway&City(Litre/100km) Combined_Range(km) CO2 Emissions(g/km) CO2 Rating Smog Rating
37 2017 BMW 330e Compact 65 2.0 4 B/Z* 3.3 ([29.5 kWh + 0.0 L]/100 km) 23 2.0 Z 8.5 6.9 7.8 533 118 10.0 6.0
86 2019 BMW i8 Roadster Two-seater 105 1.5 3 B/Z* 3.4 ([30.6 kWh + 0.0 L]/100 km) 29 3.0 Z 9.2 8.0 8.7 488 118 10.0 3.0
39 2017 BMW i3 REx (94 Ah) Subcompact 125 0.6 2 B 2.1 (18.9 kWh/100 km) 156 5.0 Z 6.5 7.0 6.7 129 18 10.0 8.0
73 2018 MINI Cooper SE Countryman ALL4 Mid-size 65 1.5 3 B/Z* 3.6 ([31.4 kWh + 0.0 L]/100 km) 19 3.0 Z 8.4 8.8 8.6 420 139 9.0 3.0
18 2015 Porsche Panamera S E-Hybrid Full-size 71 3.0 6 B/Z 4.6 ([31.7 kWh + 1.0 L]/100 km) 26 3.0 Z 10.4 8.0 9.3 871 142 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
66 2018 Ford Fusion Energi Mid-size 35 2.0 4 B/X* 2.4 ([21.5 kWh + 0.0 L]/100 km) 35 2.5 X 5.5 5.7 5.6 947 70 10.0 7.0
14 2015 Ford C-MAX Energi Mid-size 35 2.0 4 B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.5 X 5.8 6.5 6.1 856 80 0.0 0.0
46 2017 Ford Fusion Energi Mid-size 35 2.0 4 B/X* 2.4 ([21.5 kWh + 0.0 L]/100 km) 35 2.5 X 5.5 5.7 5.6 947 70 10.0 8.0
63 2018 Cadillac CT6 Plug-in Hybrid Mid-size 178 2.0 4 B/Z* 3.8 ([33.7 kWh + 0.0 L]/100 km) 50 4.5 Z 10.2 8.1 9.2 642 89 10.0 3.0
88 2019 Chrysler Pacifica Hybrid Minivan 89 3.6 6 B/X* 2.9 ([25.8 kWh + 0.0 L]/100 km) 51 2.0 X 8.0 7.9 8.0 784 74 10.0 7.0

102 rows × 19 columns

In [76]:
test3
Out[76]:
Year Make Model Vehicle Class Motor(kW) Engine Size(Litre) Cylinders Fuel Type(Electricity/Gasoline) Electricity/Gasoline(100km) Range(km) Recharge Time(hours) Fuel Type 2(Gasoline only) City(Litre/100km) Highway(Litre/100km) Highway&City(Litre/100km) Combined_Range(km) CO2 Emissions(g/km) CO2 Rating Smog Rating
50 2017 Mercedes-Benz GLE 550e 4MATIC SUV: Standard 85 3.0 6 B/Z 5.5 ([42.7 kWh + 0.8 L]/100 km) 19 1.90 Z 11.8 10.3 11.1 719 184 7.0 6.0
91 2019 Hyundai IONIQ Electric Plus Mid-size 45 1.6 4 B/X* 2.0 ([17.7 kWh + 0.0 L]/100 km) 47 2.30 X 4.4 4.6 4.5 961 46 10.0 7.0
99 2019 Porsche Cayenne E-Hybrid SUV: Standard 99 3.0 6 B/Z* 5.1 ([45.5 kWh + 0.0 L]/100 km) 21 3.00 Z 11.4 10.0 10.8 698 169 7.0 3.0
45 2017 Ford C-MAX Energi Mid-size 35 2.0 4 B/X* 2.5 ([22.0 kWh + 0.0 L]/100 km) 33 2.50 X 5.8 6.2 6.0 884 76 10.0 8.0
101 2019 Porsche Panamera Turbo S E-Hybrid Full-size 70 4.0 8 B 4.9 (41.0 kWh/100 km) 23 3.00 Z 12.3 10.6 11.5 695 180 7.0 3.0
1 2013 Chevrolet Volt Compact 111 1.4 4 B 2.4 (21.4 kWh/100 km) 61 4.00 Z 6.7 5.9 6.4 550 45 0.0 0.0
135 2020 Volvo XC90 T8 AWD SUV: Standard 65 2.0 4 B/Z* 4.0 ([36.1 kWh + 0.0 L]/100 km) 29 3.00 Z 9.1 8.4 8.8 813 120 10.0 7.0
3 2013 Ford Fusion Energi Mid-size 35 2.0 4 B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.50 X 5.8 6.5 6.1 856 80 0.0 0.0
94 2019 Kia Niro Plug-in Hybrid Station wagon: Small 45 1.6 4 B/X* 2.2 ([19.7 kWh + 0.0 L]/100 km) 42 2.25 X 4.9 5.3 5.1 853 56 10.0 7.0
130 2020 Toyota Prius Prime Mid-size 71 1.8 4 B/X* 1.8 ([15.8 kWh + 0.0 L]/100 km) 40 2.00 X 4.3 4.4 4.3 995 49 10.0 7.0
112 2020 BMW i3s REx (120 Ah) Subcompact 135 0.6 2 B 2.4 (19.9 kWh/100 km) 203 7.00 Z 7.8 7.5 7.7 116 14 10.0 7.0
79 2018 Volvo XC90 T8 AWD SUV: Standard 65 2.0 4 B/Z* 3.9 ([34.4 kWh + 0.0 L]/100 km) 27 3.00 Z 9.6 8.6 9.2 547 130 9.0 7.0
134 2020 Volvo XC60 T8 AWD SUV: Small 65 2.0 4 B/Z* 4.2 ([38.2 kWh + 0.0 L]/100 km) 27 3.00 Z 9.5 8.7 9.1 781 128 9.0 7.0
109 2020 BMW 530e xDrive Compact 83 2.0 4 B/Z* 3.6 ([32.2 kWh + 0.0 L]/100 km) 31 3.00 Z 10.3 8.1 9.3 500 120 10.0 7.0
110 2020 BMW 745Le xDrive Full-size 83 3.0 6 B/Z* 4.2 ([37.7 kWh + 0.0 L]/100 km) 26 3.00 Z 12.2 9.1 10.8 435 152 8.0 3.0
21 2016 Audi A3 e-tron ultra Compact 80 1.4 4 B/Z* 2.7 ([23.9 kWh + 0.0 L]/100 km) 27 2.50 Z 6.4 5.7 6.1 663 85 10.0 0.0
40 2017 BMW i8 Subcompact 96 1.5 3 B/Z 3.1 ([26.9 kWh + 0.2 L]/100 km) 24 2.00 Z 8.4 8.1 8.3 509 123 10.0 6.0
44 2017 Chrysler Pacifica Hybrid Minivan 89 3.6 6 B/X* 2.8 ([24.9 kWh + 0.0 L]/100 km) 53 2.00 X 7.3 7.2 7.3 858 66 10.0 8.0
54 2017 Volvo XC90 T8 AWD SUV: Standard 65 2.0 4 B/Z* 4.8 ([47.3 kWh + 0.0 L]/100 km) 21 3.00 Z 10.0 8.8 9.5 529 148 8.0 8.0
78 2018 Volvo XC60 T8 AWD SUV: Small 65 2.0 4 B/Z* 4.0 ([34.9 kWh + 0.0 L]/100 km) 27 3.00 Z 10.1 8.5 9.4 534 133 9.0 7.0
76 2018 Toyota Prius Prime Mid-size 71 1.8 4 B/X* 1.8 ([15.8 kWh + 0.0 L]/100 km) 40 2.00 X 4.3 4.4 4.3 995 49 10.0 7.0
61 2018 BMW i3s REx Subcompact 135 0.6 2 B 2.2 (19.1 kWh/100 km) 156 5.00 Z 6.6 7.1 6.8 129 18 10.0 7.0
52 2017 Porsche Cayenne S E-Hybrid SUV: Standard 71 3.0 6 B 5.1 (44.2 kWh/100 km) 23 3.00 Z 11.3 9.8 10.6 768 160 8.0 6.0
132 2020 Volvo S90 T8 AWD Mid-size 65 2.0 4 B/Z 3.7 ([31.4 kWh + 0.2 L]/100 km) 34 3.00 Z 8.3 7.5 7.9 761 99 10.0 7.0
84 2019 BMW i3s REx (120 Ah) Subcompact 135 0.6 2 B 2.4 (19.9 kWh/100 km) 203 7.00 Z 7.8 7.5 7.7 116 14 10.0 7.0
29 2016 Ford C-MAX Energi Mid-size 35 2.0 4 B/X 2.7 ([23.2 kWh + 0.1 L]/100 km) 32 2.50 X 5.8 6.5 6.1 856 80 10.0 0.0
59 2018 BMW 740e xDrive Full-size 83 2.0 4 B/Z* 3.6 ([32.1 kWh + 0.0 L]/100 km) 23 3.00 Z 9.5 8.0 8.8 525 133 9.0 3.0
106 2020 Audi Q5 TFSIe SUV: Small 105 2.0 4 B/Z* 3.6 ([32.2 kWh + 0.0 L]/100 km) 32 2.40 Z 9.5 8.1 8.9 615 103 10.0 7.0
56 2018 BMW 330e Compact 65 2.0 4 B/Z* 3.3 ([29.5 kWh + 0.0 L]/100 km) 23 2.00 Z 8.5 6.9 7.8 533 118 10.0 3.0
28 2016 Chevrolet Volt Compact 111 1.5 4 B 2.2 (19.9 kWh/100 km) 85 4.50 X 5.5 5.6 5.6 591 32 10.0 0.0
26 2016 Cadillac ELR Subcompact 174 1.4 4 B 2.7 (24.4 kWh/100 km) 64 5.00 Z 7.6 7.1 7.4 480 57 10.0 0.0
102 2019 Toyota Prius Prime Mid-size 71 1.8 4 B/X* 1.8 ([15.8 kWh + 0.0 L]/100 km) 40 2.00 X 4.3 4.4 4.3 995 49 10.0 7.0
17 2015 Porsche Cayenne S E-Hybrid SUV: Standard 71 3.0 6 B 5.0 (43.2 kWh/100 km) 24 3.00 Z 11.3 9.8 10.6 758 162 0.0 0.0
81 2019 BMW 530e xDrive Compact 83 2.0 4 B/Z* 3.5 ([30.2 kWh + 0.0 L]/100 km) 24 2.00 Z 8.8 7.7 8.3 554 124 10.0 7.0

Train the model using training data

In [77]:
slope, intercept, r_value, p_value, slope_std_error = stats.linregress(
    train1.index, train1['Number_of_Cars_sold'])

# Least-squares line of units sold against the row index of the training set.
In [78]:
train_regr = intercept + slope * train1.index

plt.scatter(train1.index, train1['Number_of_Cars_sold'])
plt.plot(train1.index, train_regr, color='red')
Out[78]:
[<matplotlib.lines.Line2D at 0x229b328ebe0>]
In [79]:
slope, intercept, r_value, p_value, slope_std_error = stats.linregress(
    train2.index, train2['Range(km)'])
In [80]:
train_regr = intercept + slope * train2.index

plt.scatter(train2.index, train2['Range(km)'])
plt.plot(train2.index, train_regr, color='red')
Out[80]:
[<matplotlib.lines.Line2D at 0x229b3713700>]
In [81]:
slope, intercept, r_value, p_value, slope_std_error = stats.linregress(
    train3.index, train3['Combined_Range(km)'])
In [82]:
train_regr = intercept + slope * train3.index

plt.scatter(train3.index, train3['Combined_Range(km)'])
plt.plot(train3.index, train_regr, color='red')
Out[82]:
[<matplotlib.lines.Line2D at 0x229b3bb43a0>]
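The cells above fit each line on the training rows only; the held-out `test1`, `test2` and `test3` tables are not used in this section. A natural next step is to score the fitted line on the test rows. The sketch below does that in plain Python on made-up numbers. `fit_line` and `r_squared` are hypothetical helpers, and the fit is the same ordinary least-squares line that `stats.linregress` computes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data: fit on the training points, score on unseen test points.
train_x, train_y = [0, 1, 2, 3], [1, 3, 5, 7]
test_x, test_y = [4, 5], [9, 12]

slope, intercept = fit_line(train_x, train_y)
test_score = r_squared(test_x, test_y, slope, intercept)
print(slope, intercept, test_score)
```

With the notebook's data, `train1.index` and `test1.index` would take the place of `train_x` and `test_x`. Because the row index has no physical meaning, a column such as `Date` or `Year` may be a more informative predictor.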
In [83]:
!jupyter nbconvert EV.ipynb --to 'html'
[NbConvertApp] Converting notebook EV.ipynb to html
[NbConvertApp] Writing 3795473 bytes to EV.html